Output distributional influence function

Authors

  • Sari Peltonen
  • Pauli Kuosmanen
  • Jaakko Astola
Abstract

When selecting a filter for an application, it is often essential to know how the filter behaves in the presence of contamination. This robustness is traditionally explored by means of the influence function (IF) and the change-of-variance function (CVF). However, because these are asymptotic measures, it is uncertain how well the results apply to the finite-length filters used in real-world applications. This paper dispels this uncertainty by presenting a new method, called the output distributional influence function (ODIF), for examining the robustness of finite-length filters. The method gives extensive information about the robustness of any filter with a known output distribution function. As examples, the ODIFs for the distribution function, density function, expectation, and variance are given for the mean and the median filters and interpreted in detail.

1. INFLUENCE FUNCTION

The influence function (IF) is a useful heuristic tool of robust statistics, introduced by Hampel [2, 3] under the name influence curve (IC), for studying the performance of filters under noisy conditions.

Definition 1. The IF of estimator $T$ at underlying probability distribution $F$ is given by

$$\mathrm{IF}(x; T, F) = \lim_{t \to 0^+} \frac{T((1-t)F + t\Delta_x) - T(F)}{t}$$

for those $x$ where this limit exists.

In this definition $\Delta_x$ is the probability measure that puts mass 1 at the point $x$. The IF gives the effect that an infinitesimal contamination at point $x$ has on the estimator $T$, divided by the mass of the contamination. The IF thus gives the asymptotic bias caused by the contamination and characterizes the properties of the estimator as the number of observations approaches infinity. We denote by $\Phi$ and $\phi$ the distribution and density functions of the standard normal distribution. The influence functions for the mean and the median are shown in Figure 1, where the underlying distribution is $F = \Phi$.
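As a rough numerical illustration of Definition 1, the sketch below approximates the limit with a small fixed contamination mass instead of taking t to zero. The function names and the choice t = 1e-6 are assumptions made for this example and do not come from the paper. The contaminated median is computed directly from the mixture $(1-t)\Phi + t\Delta_x$.

```python
from statistics import NormalDist

PHI = NormalDist()  # standard normal, i.e. F = Phi


def mean_functional(x, t):
    """Mean of the mixture (1 - t) * Phi + t * Delta_x (Phi has mean 0)."""
    return (1.0 - t) * 0.0 + t * x


def median_functional(x, t):
    """Median of (1 - t) * Phi + t * Delta_x: the smallest m with G(m) >= 1/2."""
    # Candidate median when the point mass lies strictly above the median.
    above = PHI.inv_cdf(0.5 / (1.0 - t))
    # Candidate median when the point mass lies at or below the median.
    below = PHI.inv_cdf((0.5 - t) / (1.0 - t))
    if x > above:
        return above
    if x <= below:
        return below
    # Otherwise the jump of size t at x carries G across 1/2.
    return x


def influence(functional, x, t=1e-6):
    """Finite-t approximation of IF(x; T, Phi) from Definition 1."""
    return (functional(x, t) - functional(x, 0.0)) / t


if __name__ == "__main__":
    for x in (-3.0, -1.0, 1.0, 3.0):
        print(x, influence(mean_functional, x), influence(median_functional, x))
```

Under these assumptions the approximation for the mean grows linearly with x, while the approximation for the median stays near a fixed magnitude with the sign of x.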
For the mean, the gross error sensitivity, i.e., the worst influence that a small amount of contamination of fixed size can have on the value of the estimator, is infinite; for the median it is finite and equals $\sqrt{\pi/2} \approx 1.253$. So for the mean a single outlier can carry the estimate over all bounds, whereas for the median an outlier has a fixed influence.

Figure 1: The IF of the mean (–) and the median (-) at $F = \Phi$.

2. CHANGE-OF-VARIANCE FUNCTION

The IF captures only one aspect of the robustness of an estimator, namely the local robustness of its asymptotic value. Another important aspect is the local robustness of the asymptotic variance. The asymptotic variance of estimator $T$ at $F$, denoted by $V(T, F)$, is defined to be the variance of $\sqrt{N}\,[T(F_N) - T(F)]$ as $N \to \infty$, where $F_N$ is the empirical distribution of the sample $(X_1, X_2, \ldots, X_N)$. Local robustness of the asymptotic variance can be characterized by the change-of-variance function (CVF), defined as follows [4].

Definition 2. The CVF of estimator $T$ at $F$ is defined as

$$\mathrm{CVF}(x; T, F) = \lim_{t \to 0^+} \frac{V(T, (1-t)F + t\Delta_x) - V(T, F)}{t}$$

for those $x$ where this limit exists.

If $F = \Phi$, the CVF of the mean is $x^2 - 1$, which is displayed in Figure 2. The same figure also shows the CVF of the median at $F = \Phi$. It has a constant value $\sqrt{\pi/2} \approx 1.253$ everywhere except at zero, where the graph has a negative delta function. A negative CVF means that the asymptotic variance of the estimator has decreased, and a positive CVF means that it has increased. So for the mean the asymptotic variance decreases if the contamination lies in the interval $(-1, 1)$. The further the contamination is from this interval, the more the variance increases, and a single outlier can carry the asymptotic variance over all bounds. For the median, contamination at the origin reduces the asymptotic variance significantly, and contamination anywhere else causes only a constant increase in the variance of the median. So the median is robust in this sense as well.
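The CVF of the mean at $F = \Phi$ can be checked numerically with a minimal sketch. It relies on the standard fact that the asymptotic variance of the sample mean equals the variance of the underlying distribution, so $V$ for the contaminated mixture is simply the mixture's variance. The function names and the step t = 1e-6 are assumptions for illustration, not part of the paper.

```python
def mixture_variance(x, t):
    """Variance of (1 - t) * Phi + t * Delta_x, where Phi is standard normal."""
    first_moment = t * x                       # Phi contributes mean 0
    second_moment = (1.0 - t) * 1.0 + t * x * x  # Phi contributes E[X^2] = 1
    return second_moment - first_moment ** 2


def cvf_mean(x, t=1e-6):
    """Finite-t approximation of CVF(x; mean, Phi) from Definition 2."""
    return (mixture_variance(x, t) - mixture_variance(x, 0.0)) / t


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0, 3.0):
        # Compare the approximation with the closed form x^2 - 1.
        print(x, cvf_mean(x), x * x - 1.0)
```

The approximation is negative for contamination inside $(-1, 1)$ and grows quadratically outside it, in line with the closed form $x^2 - 1$.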

Journal title:
  • IEEE Trans. Signal Processing

Volume 49  Issue 

Pages  -

Publication date 1999